Random effects logistic regression model for anomaly detection

Authors

  • Min Seok Mok
  • So Young Sohn
  • Yong Han Ju
Abstract

As the influence of the internet continues to expand as a medium for communications and commerce, the threat from spammers, system attackers, and criminal enterprises has grown accordingly. This paper proposes a random effects logistic regression model for anomaly detection. Unlike previous studies on anomaly detection, it applies a random effects model, which accommodates not only the risk factors of the exposures but also the uncertainty not explained by those factors. Specific risk-category factors that were retained, such as ‘protocol type’ and ‘logged in’, are included in the proposed model. The research is based on a random sample of 49,427 observations on 42 variables from the KDD-cup 1999 (Data Mining and Knowledge Discovery competition) data set, which contains ‘normal’ and ‘anomaly’ connections. The proposed model achieves a classification accuracy of 98.94% on the training data set and 98.68% on the validation data set. © 2010 Elsevier Ltd. All rights reserved.
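The abstract does not give the model equation, so the following is only a minimal sketch of how a random-intercept logistic model could score connections, assuming the common form logit(p) = intercept + x·beta + u[group], where u is a group-level random effect. Grouping by protocol type, the two features, and every numeric value here are hypothetical illustrations, not the paper's specification or estimates.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, beta, intercept, group, random_intercepts):
    """P(anomaly) under an assumed random-intercept logistic model.

    logit(p) = intercept + x . beta + u[group]
    Groups without an estimated random effect fall back to u = 0.
    """
    u = random_intercepts.get(group, 0.0)
    z = intercept + sum(b * v for b, v in zip(beta, x)) + u
    return sigmoid(z)


def accuracy(y_true, y_pred):
    """Fraction of connections whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical fixed effects for [scaled traffic feature, logged_in flag].
beta = [1.5, -2.0]
intercept = -0.5
# Hypothetical random intercepts, one per protocol type (assumed grouping).
random_intercepts = {"tcp": 0.3, "udp": -0.4, "icmp": 0.8}

# Toy connections: (features, protocol type, label), 1 = anomaly, 0 = normal.
connections = [
    ([2.0, 0.0], "icmp", 1),
    ([0.1, 1.0], "tcp", 0),
    ([0.5, 1.0], "udp", 0),
    ([1.8, 0.0], "tcp", 1),
]

probs = [predict_proba(x, beta, intercept, g, random_intercepts)
         for x, g, _ in connections]
preds = [int(p >= 0.5) for p in probs]  # classify at a 0.5 cut-off
labels = [y for _, _, y in connections]
print(accuracy(labels, preds))
```

In practice the fixed effects and the variance of the random intercepts would be estimated from data with a mixed-model routine; the sketch only shows how the random term shifts the predicted log-odds for each group and how a classification accuracy like the ones reported would be computed.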

Similar resources

Determining the Minimum Sample Size of Audit Data Required to Profile User Behavior and Detect Anomaly Intrusion

Although statistical modeling techniques have been employed to detect anomaly intrusion and profile user behavior with network traffic data collected from multi-sites (IP addresses), the minimum sample size of audit data required for each site is unclear. Using the Intrusion Detection Evaluation off-line data developed by the Lincoln Laboratory at Massachusetts Institute of Technology under the...

A multinomial logistic regression modeling approach for anomaly intrusion detection

Although researchers have long studied using statistical modeling techniques to detect anomaly intrusion and profile user behavior, the feasibility of applying multinomial logistic regression modeling to predict multi-attack types has not been addressed, and the risk factors associated with individual major attacks remain unclear. To address the gaps, this study used the KDD-cup 1999 data and b...

Random Forest Classification for Android Malware

Classification techniques such as Support Vector Machines, K-Nearest Neighbours, Decision Trees, Logistic Regression and Naive Bayes have widely been used in the area of intrusion detection research in the security community. They are predominantly used for behaviour based detection methods (anomaly detection methods). In this paper we exclusively apply the ensemble learning algorithm Random Fo...

On-board Clutch Slippage Detection and Diagnosis in Heavy Duty Machine

In order to reduce unnecessary stops and expensive downtime originating from clutch failure in construction equipment machines, adequate real time sensor data measured on the machine, in combination with feature extraction and classification methods, may be utilized. This paper presents a framework with feature extraction methods and an anomaly detection module combined with Case-Based Reasoning ...

Anomaly Detection using Decision Tree based Classifiers

Data mining techniques can uncover knowledge in the form of various characteristics and patterns. In this regard, this paper presents the detection of anomalies/outliers using various decision tree based classifiers, viz. Best-first Decision Tree, Functional Tree, Logistic Model Tree, J48 and Random Forest decision tree. Three real world datasets have been used in ...

Journal:
  • Expert Syst. Appl.

Volume 37

Publication date: 2010